gguf : use Qn_K for k-quants instead of KQn #837

compilade · 2024-05-24T18:45:23Z

#822 (by @mofosyne) introduced a naming convention for GGUF model files, but the way it names k-quants doesn't follow established practice: everywhere else k-quants are named, they use Qn_K, where n is the number of bits per weight excluding the scales.

rg -i 'KQ\d' doesn't return anything related to quants except for this recently-added section, while rg -i 'Q\d_K' returns a lot of things related to k-quants when run in the ggml and llama.cpp repos.

So this renames KQn to Qn_K (e.g. KQ2 becomes Q2_K), for consistency. This should avoid unnecessary confusion.
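
As an illustration of the rename, here is a minimal Python sketch that rewrites KQn tokens to Qn_K in arbitrary text. The helper name kq_to_qn_k and the sample file names are hypothetical; this is not code from the PR, ggml, or llama.cpp, and it assumes the quant label is a standalone token (not glued to other letters or digits).

```python
import re

# Hypothetical helper for illustration only; not part of ggml or llama.cpp.
# Matches "KQ" followed by digits, as long as the token is not embedded in a
# longer run of letters or digits (so "KQ2x" or "AKQ2" are left alone).
_KQ_PATTERN = re.compile(r"(?<![A-Za-z0-9])KQ(\d+)(?![A-Za-z0-9])")


def kq_to_qn_k(text: str) -> str:
    """Rewrite every KQn token in text to the Qn_K form."""
    return _KQ_PATTERN.sub(r"Q\1_K", text)


if __name__ == "__main__":
    # Made-up file names, used only to show the transformation.
    for name in ["example-7B-KQ2.gguf", "example-7B-Q4_0.gguf"]:
        print(name, "->", kq_to_qn_k(name))
```

Names that already follow the Qn_K form, or that use other quant types, pass through unchanged.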

(Note that the recently-added wiki page about "tensor encoding schemes" will need to be updated too, since it is the only other place I found that also uses this KQ<X> naming scheme.)

gguf : use Qn_K for k-quants instead of KQn

85a895a

ggerganov approved these changes May 24, 2024

ggerganov merged commit 8d6b703 into ggml-org:master May 24, 2024
